Abstract

Background: Protection of public health from rabies is informed by the analysis of surveillance data from human 
and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal 
level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. 
Aggregation of agency-specific data into one database application would enable more comprehensive data 
analyses and effective communication among participating agencies. In Quebec, RageDB was developed to house 
surveillance data for the raccoon rabies variant, representing the next generation in web-based database 
applications that provide a key resource for the protection of public health. 

Results: RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of 
raccoon rabies in Quebec. Technological advancements that RageDB brings to rabies surveillance databases include 1) 
automatic integration of multi-agency data and diagnostic results on a daily basis; 2) a web-based data editing 
interface that enables authorized users to add, edit and extract data; and 3) an interactive dashboard to help 
visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from 
citizens who voluntarily report sightings of rabies suspect animals. We also discuss how sightings data can indicate 
public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources 
for protecting public health. 

Conclusions: RageDB provides an example in the evolution of spatio-temporal database applications for the 
storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to 
develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The 
database increases communication among agencies collaborating to protect human health from raccoon rabies. 
Furthermore, health agencies have real-time access to a wide assortment of data documenting new developments 
in the raccoon rabies epidemic and this enables a more timely and appropriate response. 



Background 

Rabies is a worldwide threat to public health, killing 
more than 55,000 people annually [1]. In North America, 
the raccoon variant of this virus has resulted in the 
largest wildlife zoonosis on record [2-4]. Though 
specifically adapted to raccoons, raccoon rabies can spill over 
into other mammals, including humans, through contact 
with infected saliva [5]. If not promptly treated, the 
rabies virus causes fatal encephalitis in nearly 100% of 
the human cases. Raccoon rabies was first observed in 
Florida in the 1940s [5] and then in the 1970s a second 
outbreak in West Virginia led to the current distribution 
of the disease in eastern North America [2,3] (Figure 1). 
Rabies is a significant public health concern because dis- 
ease reservoirs occur in rural and urban areas of eastern 
North America, enabling rabid wildlife to infect humans 
directly or indirectly through domestic animals. 

North American public health, agriculture and wildlife 
management agencies spend millions of dollars each 
year to protect the public and reduce raccoon rabies 
incidence [6,7]. The success of programs to control and 
prevent rabies depends on effective disease surveillance. 
In essence, disease surveillance requires sampling ani- 
mals from populations of interest and diagnosing their 
disease status to estimate public health risk. More 
extensive surveillance may include collecting information 
on precise sample locations (e.g., georeferenced to a 
point location rather than a county centroid), sampling 
date, and biological characteristics of the animal sample 
(e.g., species, sex, general health condition). This 
information can then be used to generate more accurate and 
comprehensive epidemiological measures for quantifying 
the epidemic (e.g., prevalence, rate of spread) by 
accounting for factors that may affect the interpretation 
of surveillance data (e.g., biases in sampling methods). 
Ideally, surveillance programs enable a thorough under- 
standing of the epidemic for estimating the health risk 
and for devising effective disease control strategies. 

The effectiveness of surveillance programs to contri- 
bute data that are informative for risk assessment and 
disease control depends on the responsible agencies. 
Surveillance for diseases like rabies that affect public 
and animal health is often performed by multiple agen- 
cies. Even though the focus of an individual agency may 
be exclusively human, domestic animal or wild animal 
health, an understanding of all three is needed for the 
protection of public health. A multi-agency approach to 
surveillance is advantageous because it combines data from 
agencies with different mandates, typically resulting in a 
wider assortment of data collected over space and time. 
For instance, in Quebec, five organisations participate in 
raccoon rabies surveillance. At the federal level is the 
Canadian Food Inspection Agency (CFIA), and the provincial 
agencies include the Ministère de l'Agriculture, des 
Pêcheries et de l'Alimentation du Québec (MAPAQ), the 
Ministère des Ressources naturelles et de la Faune (MRNF), 
the Institut national de santé publique du Québec (INSPQ), 
and the Centre québécois sur la santé des animaux sauvages 
(CQSAS) at the Université de Montréal. The CFIA, MAPAQ, 
and INSPQ are domestic animal and human health 
organizations, while the MRNF and CQSAS are concerned with 
the management and health of wild populations. The CFIA 
diagnoses animals collected during passive surveillance that 
may have exposed people or domestic animals to rabies, 
and additionally record information on sample timing 
and location. The INSPQ and regional public health 
agencies also participate in passive surveillance 
through collaboration with the CFIA. The MAPAQ is 
responsible for recording the location and frequency of 
citizens reporting rabies suspect animals, which may or 
may not result in an investigation to test disease status. 
The MRNF processes samples from their disease con- 
trol programs (e.g., population reduction) and 
enhanced surveillance of samples collected to monitor 
the disease status of wild populations (e.g., road mor- 
talities). The CQSAS conducts pathological analyses on 
samples and records information on weight, status of 
other diseases and parasite load, in addition to docu- 
menting typical biological attributes recorded for rabies 
surveillance (i.e., species, sex, age, rabies status; see endnote a). Thus, 
there clearly can be a wide assortment of data available 
for designing effective disease management programs. 
However, to appropriately use data pooled from multi- 
ple agencies it is necessary to account for biases 
that can result from agency-specific sampling 
characteristics. 



There are several ways in which agency surveillance 
data may differ. Firstly, agency-specific sampling designs 
may range in the spatial extent of their surveillance 
zone and the distribution and frequency of sampling 
events within the zone. Since rabies incidence varies 
across the landscape [8] and cycles in time [9,10], suffi- 
cient samples should be collected to capture the varia- 
tion underlying the spatio-temporal characteristics of an 
epizootic. Failure to do so can bias the accuracy of epi- 
demiological measures over space and time relative to 
the sampling design [11]. Secondly, the type of surveil- 
lance method used by agencies can lead to an under- or 
overestimation in observed prevalence. For example, 
surveillance through citizen notification detects a 
higher proportion of infected animals than trapping 
programs do [12]. Citizens are more likely 
to report animals behaving abnormally from rabies 
infection than healthy animals, whereas traps typically 
capture healthy animals rather than those at the later 
stages of infection that are too sick to be attracted by or 
encounter traps. Thirdly, the geographic resolution of 
sample locations may vary among agencies, constraining 
the spatial scale for data interpretation. Sample locations 
may be georeferenced by a global positioning system 
(GPS) device, municipal address, or to the centre of the 
associated town, county or census tract. It is always possible 
to georeference sample locations to larger spatial units, 
but not vice versa. Thus, the scale of the sample loca- 
tion affects the scale of the data interpretation. For 
example, rabies incidence data collected by the New 
York State Department of Health georeferenced to the 
street address are appropriate for accounting for the 
effect of human population density on observed rabies 
incidence per census tract (e.g., [8]), but are too coarse 
for assessing the effects of micro-scale habitat features 
such as the interface of corn agriculture and forest on 
the rabies epidemic (e.g., [12]). Thus, the agency-specific 
differences that can affect data interpretation must be 
taken into account for the appropriate use of data. 
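
The one-way relationship between spatial scales described above can be made explicit in code. The following minimal Python sketch assumes a hypothetical ordering of georeferencing levels; the level names and the `usable_at` helper are illustrative only and are not taken from any agency's schema. A sample is treated as usable for an analysis only if it was georeferenced at that scale or a finer one.

```python
from enum import IntEnum


class Resolution(IntEnum):
    """Hypothetical georeferencing levels, ordered from finest to coarsest."""
    GPS_POINT = 1
    STREET_ADDRESS = 2
    CENSUS_TRACT = 3
    COUNTY = 4


def usable_at(sample_resolution: Resolution, analysis_scale: Resolution) -> bool:
    """A sample can be aggregated up to a coarser unit, never refined to a finer one."""
    return sample_resolution <= analysis_scale


# Illustrative samples: (sample id, resolution at which it was georeferenced)
samples = [
    ("s1", Resolution.GPS_POINT),
    ("s2", Resolution.STREET_ADDRESS),
    ("s3", Resolution.COUNTY),
]

# Keep only the samples precise enough for a census-tract analysis
tract_level = [sid for sid, res in samples if usable_at(res, Resolution.CENSUS_TRACT)]
```

Storing the resolution level with every record, rather than inferring it later, is what lets a user filter out samples that are too coarse for a given question.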

Combining agency-specific surveillance data into one 
data repository is extremely beneficial for objective ana- 
lysis and communication of surveillance findings. The 
design of the database can ensure that data are explicitly 
characterised by factors potentially affecting interpreta- 
tion (e.g., level of geographic resolution for sample loca- 
tion). It is then the responsibility of the user to 
appropriately account for these factors. Also, use of a 
single data repository facilitates faster, coordinated 
access to data from agencies. This allows for more 
comprehensive analyses and communication of findings than 
is possible with only an agency-specific database, and 
avoids wasting human resources on the creation of 
multiple independent and often redundant databases. The 
US pioneered an approach for housing rabies surveillance 
data in a single repository through the creation of 
RabID [13], an Internet-accessible mapping application of 
rabies cases for public health agencies. Though RabID was 
advanced for mapping queried data by time period or 
species, it had no facilities for summarising queried 
data in other formats (e.g., plot, table) or for 
summarising a wider variety of surveillance data 
characteristics (e.g., type of sampling method, number of 
samples tested for rabies). Furthermore, the integration 
of data required a series of steps by personnel and 
automated processes to transfer, format and validate data. In 
2008, the Raccoon Rabies Scientific Committee for the 
Government of Quebec initiated development of their 
own next generation database, RageDB, to combine data 
from the CFIA, MAPAQ, MRNF, and CQSAS. 

In this paper we outline the major advancements of 
RageDB: an automated data integration process from 
multiple agencies, next generation Web 2.0 applications 
for data-editing and mapping, and the storage of citizen 
participation data in rabies surveillance. We comment on 
the value of including these data and also 
discuss the role of RageDB in the creation of future spa- 
tio-temporal real-time web database applications. 

Results 

Multi-agency Foundation of RageDB 

The Raccoon Rabies Scientific Committee is multi-sec- 
torial and interdisciplinary. Committee members are 
from, or closely associated with, all agencies 
participating in raccoon rabies surveillance in Quebec. Therefore, 
as facilitated through the committee, data storage and 
security needs for each agency stakeholder were taken 
into account during the development of RageDB. 

Database Components 

RageDB consists of three main components (Figure 2). 
First, SQL Server (Microsoft Corporation, v2008, Red- 
mond, WA, USA) relational database management sys- 
tem (RDBMS) is used as the storage tier. Second, 
agency-provided data are automatically integrated into 
SQL Server using the community (open-source) edition 
of Pentaho Data Integration platform (PDI; Pentaho 
Corporation, v4.1, Orlando, FL, USA). Finally, a Web 
application acts as the main user-interface for further 
data entry and editing, data exploration and extraction, 
as well as system administration (e.g., managing user 
accounts and privileges for data operations). 
-Automated Data Integration 

RageDB's automated data integration of multi-agency 
data is one of the key advancements in rabies surveil- 
lance repositories, as it demonstrates a simple but effec- 
tive way of incorporating data from participating 
agencies with minimal investments from their informa- 
tion technology resources. The automated data 



integration process begins by having agencies extract 
relevant data from their own database systems and then 
depositing the data into their agency's secure FTP site. 
The agencies deposit their data at defined intervals and 
in an acceptable format of their choice (i.e., ASCII or 
CSV files, Microsoft Excel spreadsheets). PDI scripts, 
running under Microsoft Windows' scheduling facility, 
are then used to automatically access the data files from 
the FTP sites, transform the data to be compatible with 
RageDB's data structure, and then insert the results into 
the database. Success or failure notifications regarding 
the data integration are automatically sent to the 
RageDB administrators using PDI's logging and email 
functions. PDI scripts are also used to regularly back up 
the database to a remote location managed by the 
Groupe de recherche en épidémiologie des zoonoses et 
santé publique (GREZOSP), Université de Montréal, in 
addition to the database's own daily backup. 
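
The fetch, transform and insert steps described above can be sketched in a few lines. RageDB itself uses PDI scripts and SQL Server; the Python version below is only an illustrative stand-in that uses the standard library's `csv` and `sqlite3` modules. The column names, the `COLUMN_MAP` mapping, the `samples` table and the sample rows are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical mapping from one agency's column names to a shared schema
COLUMN_MAP = {"Espece": "species", "DateCollecte": "sample_date", "MRC": "mrc"}


def transform(row: dict, agency: str) -> dict:
    """Rename agency-specific columns and tag the record with its source agency."""
    record = {target: row[source].strip() for source, target in COLUMN_MAP.items()}
    record["agency"] = agency
    return record


def integrate(csv_text: str, agency: str, conn: sqlite3.Connection) -> int:
    """Load one agency file into the shared table; return the number of rows inserted."""
    records = [transform(row, agency) for row in csv.DictReader(io.StringIO(csv_text))]
    with conn:  # commit on success, roll back if any insert fails
        conn.executemany(
            "INSERT INTO samples (agency, species, sample_date, mrc) "
            "VALUES (:agency, :species, :sample_date, :mrc)",
            records,
        )
    return len(records)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (agency TEXT, species TEXT, sample_date TEXT, mrc TEXT)")

# Made-up stand-in for a file that would be fetched from an agency's FTP site
agency_file = (
    "Espece,DateCollecte,MRC\n"
    "raccoon,2009-06-01,Le Haut-Richelieu\n"
    "skunk,2009-06-02,Brome-Missisquoi\n"
)
inserted = integrate(agency_file, "MRNF", conn)
```

Keeping one mapping per agency means each agency can continue to export data in its own layout, which is the low-investment property the text emphasises.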
-Data Validation 

Data validation was critical in building RageDB because 
it combines data from multiple agencies that differ in 
their data management protocols. Since the overall 
architecture of the system 
allows multiple data-entry points through the web-inter- 
face and the Pentaho scripts, data validation could not 
be limited to the web-based data entry interface. There- 
fore, in addition to the normal data integrity constraints 
provided by the relational database management system, 
all data validation rules were implemented as small sub- 
routines (called stored procedures) within the SQL Ser- 
ver. This advantageously eliminates the possibility of 
having an external process bypassing the validation 
rules. 
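
To illustrate why placing validation inside the database prevents any entry point from bypassing it, the sketch below uses a SQLite trigger from Python's standard `sqlite3` module as a stand-in for the SQL Server stored procedures described above. The table, the rule and the allowed status values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples (
    id INTEGER PRIMARY KEY,
    species TEXT NOT NULL,
    rabies_status TEXT NOT NULL
);
-- Validation rule stored in the database itself, so it applies to every entry point
CREATE TRIGGER validate_sample BEFORE INSERT ON samples
BEGIN
    SELECT RAISE(ABORT, 'invalid rabies_status')
    WHERE NEW.rabies_status NOT IN ('positive', 'negative', 'untested');
END;
""")


def try_insert(species: str, status: str) -> bool:
    """Attempt an insert; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO samples (species, rabies_status) VALUES (?, ?)",
                (species, status),
            )
        return True
    except sqlite3.DatabaseError:
        return False


accepted = try_insert("raccoon", "positive")  # satisfies the rule
rejected = try_insert("raccoon", "maybe")     # blocked by the trigger
```

Because the rule lives in the database, the same check would apply whether the row arrived from a web form or from an automated import script.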

Auditing and logging routines were also designed to 
track all modifications made to the database. These 
include the date and time of the modification and the 
values (i.e., past and current), and the identity of the 
user or automated process that made the modifications. 
Thus, by recording modifications in an audit table, 
RageDB administrators can restore data affected by 
any erroneous edit and provide a history of all edits 
made to any piece of information. 
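
An audit table of this kind can be sketched as follows. This is a minimal illustration using SQLite triggers through Python's `sqlite3` module, not RageDB's actual SQL Server routines; the table layout, the `modified_by` column and the `restore_last_edit` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples (id INTEGER PRIMARY KEY, species TEXT, modified_by TEXT);
CREATE TABLE audit (
    sample_id INTEGER, old_species TEXT, new_species TEXT,
    modified_by TEXT, modified_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Record the past and current values of every edit, with who made it and when
CREATE TRIGGER audit_species AFTER UPDATE OF species ON samples
BEGIN
    INSERT INTO audit (sample_id, old_species, new_species, modified_by)
    VALUES (OLD.id, OLD.species, NEW.species, NEW.modified_by);
END;
""")

conn.execute("INSERT INTO samples VALUES (1, 'raccoon', 'import_script')")
# An erroneous edit made through some entry point
conn.execute("UPDATE samples SET species = 'racoon', modified_by = 'user_a' WHERE id = 1")


def restore_last_edit(sample_id: int, admin: str) -> None:
    """Revert a sample to the value it held before its most recent edit."""
    (old_value,) = conn.execute(
        "SELECT old_species FROM audit WHERE sample_id = ? ORDER BY rowid DESC LIMIT 1",
        (sample_id,),
    ).fetchone()
    conn.execute(
        "UPDATE samples SET species = ?, modified_by = ? WHERE id = ?",
        (old_value, admin, sample_id),
    )


restore_last_edit(1, "admin")
history = conn.execute(
    "SELECT old_species, new_species, modified_by FROM audit ORDER BY rowid"
).fetchall()
```

Note that the restoration is itself an edit and therefore also appears in the audit history.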
-Summary & Cartographic Applications 
The Web component of RageDB is composed of 2 main 
sections. The first is the data-entry and editing interface, 
a form-based website based on Microsoft's ASP.NET 
MVC framework (Microsoft Corporation, v.2.0, Red- 
mond, WA, USA). The second section is the interactive 
dashboard, which is a pure HTML-based Web 2.0 
application, developed using JavaScript and requiring no 
proprietary plugins. The dashboard was tested to be 
compatible with current Web browsers (e.g., Internet 
Explorer versions >7, Mozilla Firefox, Apple Safari, Goo- 
gle Chrome, Opera). 

One of the key design goals in developing the dashboard 
was to provide users with a visually simple, yet attractive 
and interactive experience with the data, geared towards 
real-time data access for communication and 
decision-making. For more in-depth analyses or access to the 
complete dataset, the data-extraction facility of the main 
website can be used to output Excel or text files for subsequent 
data-processing in the user's software of choice. 

The dashboard screen is divided into 2 main areas (Fig- 
ure 3). The sidebar on the left contains controls for filter- 
ing data by date, regional county municipality (MRC) 
and species. Users can select the summary format from 
the tabs on the main panel for presenting the surveillance 
data. The summary options are: (1) four pie charts show- 
ing the proportion of animals that were (i) collected per 
different sampling methods, (ii) tested for rabies, (iii) hav- 
ing a positive or negative rabies diagnosis, and (iv) dis- 
covered living or dead at the time of sampling; (2) a 
histogram for the frequency of citizens reporting suspect 
animals per week or month; (3) a table summarising the 
number of diagnoses and the rabies status of animals per 
sampling method; and (4) a map displaying sample 
locations (Figure 4). The map view uses the open-source 
OpenLayers JavaScript mapping library with the Google 
Terrain layer as the background. Users can zoom in or 
out of the study area and choose to symbolize sample 
locations according to the disease status or health state 
of animals upon sample collection (i.e., alive or dead). To 
keep the map readable at any scale given the filtering 
options selected by the user, samples are clustered spa- 
tially and displayed as graduated symbols; symbol size 
increases relative to the number of samples clustered at 
the location. The clustering is handled dynamically on 
the server, according to the map scale and a static toler- 
ance parameter that represents the minimum distance in 
pixels, at the given map scale, between each cluster. 
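
The paper does not give the clustering algorithm itself, so the following Python sketch is only one plausible way to implement scale-dependent clustering with a pixel tolerance. It uses a simple greedy pass: each sample joins the first cluster whose on-screen position is closer than the tolerance, otherwise it starts a new cluster. The function name, the `pixels_per_unit` parameter and the sample coordinates are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    x: float        # cluster position in pixels (mean of its members)
    y: float
    count: int = 1  # number of samples merged into this cluster


def cluster_samples(points, pixels_per_unit, tolerance_px):
    """Greedily merge points that fall within tolerance_px of an existing cluster.

    points: iterable of (x, y) map coordinates
    pixels_per_unit: map scale, i.e. screen pixels per map unit at the current zoom
    tolerance_px: approximate minimum on-screen distance, in pixels, between clusters
    """
    clusters = []
    for x, y in points:
        px, py = x * pixels_per_unit, y * pixels_per_unit
        for c in clusters:
            if math.hypot(c.x - px, c.y - py) < tolerance_px:
                # Update the running mean position and the member count
                c.x = (c.x * c.count + px) / (c.count + 1)
                c.y = (c.y * c.count + py) / (c.count + 1)
                c.count += 1
                break
        else:
            clusters.append(Cluster(px, py))
    return clusters


points = [(0.0, 0.0), (1.0, 0.0), (100.0, 100.0)]
zoomed_out = cluster_samples(points, pixels_per_unit=1.0, tolerance_px=20)
zoomed_in = cluster_samples(points, pixels_per_unit=50.0, tolerance_px=20)
```

With the same tolerance, zooming in spreads the points further apart on screen, so nearby samples separate into their own symbols; the `count` field is what would drive the graduated symbol size.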

To keep the user-interface responsive, the dashboard 
asynchronously communicates with the database 
through a Web Application Programming Interface 
(API). User interactions are sent to the server through 
the Web API and then forwarded to be processed by 
the SQL Server, with only the summarized results being 
returned to the Web application. Thus, bandwidth usage 
is minimized and the browser is prevented from proces- 
sing large amounts of data. 
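
The idea of returning only summarised results can be sketched as follows. This is an illustrative Python example using `sqlite3` in place of SQL Server; the table, the `summary_by_method` function and the rows are made up for the example and do not reflect RageDB's actual API.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (species TEXT, method TEXT, rabies_status TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [  # made-up rows for illustration only
        ("raccoon", "citizen notification", "positive"),
        ("raccoon", "citizen notification", "negative"),
        ("raccoon", "trapping", "negative"),
        ("skunk", "trapping", "negative"),
    ],
)


def summary_by_method(species: str) -> str:
    """Aggregate in the database and return only per-method counts as JSON."""
    rows = conn.execute(
        "SELECT method, COUNT(*), SUM(rabies_status = 'positive') "
        "FROM samples WHERE species = ? GROUP BY method ORDER BY method",
        (species,),
    ).fetchall()
    return json.dumps(
        [{"method": m, "tested": n, "positive": p} for m, n, p in rows]
    )


payload = summary_by_method("raccoon")
```

The response size depends on the number of sampling methods rather than the number of samples, which is why this pattern keeps bandwidth use low as the dataset grows.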

-Benefits for Including Data from Citizen Notifications 

Citizens in Quebec are encouraged to report wild or 
domestic animals suspected of having rabies because 
this is the most effective surveillance method for detecting 
rabid animals [12]. Suspect animals include those that are 
found dead, behaving abnormally or with physical signs of 
aggressive encounters with other animals (e.g., lesions 
from bites). Citizens make a report through a phone 
number or website dedicated to rabies surveillance. The 
MAPAQ receives all citizen notifications and determines 
whether or not to dispatch an MRNF technician to collect 
the animal for rabies diagnosis. RageDB advantageously 
stores data on all citizen notifications, irrespective of 
whether the suspect animal was sampled for testing, 
though this distinction is also made in the database. Public 
health officials can use data on the total number of citizen 
notifications to measure public involvement in reporting 
suspect animals over time. This is important because 
citizens are more vigilant in participating at the onset of a 
known epidemic, but tend to become more complacent 
over time about disease presence. Therefore, regional 
monitoring of citizen participation can be used to identify 
areas where public participation is waning, and 
appropriate strategies can then be devised to increase public 
participation (e.g., public service announcements in local 
newspapers and radio stations). 
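
As a rough illustration of how stored notifications could be used to spot waning participation, the Python sketch below counts notifications per region and month and flags regions whose count fell sharply between two periods. The threshold, the region names and the notification dates are hypothetical; the paper does not prescribe a specific method.

```python
from collections import Counter


def monthly_counts(notifications):
    """Count notifications per (region, 'YYYY-MM') from (region, ISO date) pairs."""
    return Counter((region, date[:7]) for region, date in notifications)


def waning_regions(notifications, earlier_month, later_month, drop_ratio=0.5):
    """Return regions whose count in later_month fell below drop_ratio x earlier_month."""
    counts = monthly_counts(notifications)
    regions = {region for region, _ in counts}
    return sorted(
        r for r in regions
        if counts[(r, later_month)] < drop_ratio * counts[(r, earlier_month)]
    )


# Made-up notifications for two hypothetical regions
notifications = [
    ("Region A", "2008-06-03"), ("Region A", "2008-06-10"),
    ("Region A", "2008-06-21"), ("Region A", "2008-06-28"),
    ("Region A", "2009-06-15"),
    ("Region B", "2008-06-05"), ("Region B", "2008-06-19"),
    ("Region B", "2009-06-02"), ("Region B", "2009-06-20"),
]
flagged = waning_regions(notifications, "2008-06", "2009-06")
```

Comparing the same calendar month across years is one simple way to avoid confusing seasonal variation in animal activity with a real decline in public participation.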

Future Directions 

RageDB is being continually developed to increase its 
value and functionality for public health. Participating 
agencies have already benefited from having quick 
access to all raccoon rabies surveillance data for aiding 
in policy development and research. Stringent security 
measures and user-restricted access make RageDB a 
valuable tool for future storage of additional confidential 
data such as information on human and domestic 
animal exposures, and post-exposure prophylaxis records. 
Furthermore, the success of RageDB in aiding management 
and research of raccoon rabies in Quebec has prompted 
the MRNF, in partnership with the CQSAS, to develop 
a database for housing surveillance data from all wildlife 
diseases monitored in Quebec. RageDB's data 
integration and storage architecture, and facilities for querying, 
summarising and mapping data are being used to guide 
the development of future wildlife disease databases. 

Figure 4 RageDB dashboard cartographic output. An example of cartographic output displaying the locations of animals tested for raccoon 
rabies in Quebec as specified by species (raccoons) and time period (April 1st 2007 to April 1st 2011). Surveillance locations are depicted by red 
circles (contain ≥ 1 rabid sample) and blue circles (contain no rabid samples) and increase in size relative to the number of samples tested for 
rabies (as noted within the circle). 



Conclusions 

RageDB was built quickly (less than 3 months for the data- 
base itself and the web-editing interface) and inexpensively 
(approximately $30,000 Cdn) by using open-source tech- 
nologies, simple and efficient design strategies, and shared 
web hosting. RageDB automatically integrates new 
surveillance data daily and provides a user-friendly interface for 
real-time viewing, summarising, and extraction of data. The 
collaborative effort from the multi-sectorial and interdisci- 
plinary Raccoon Rabies Scientific Committee development 
team resulted in a multi-agency data repository that satis- 
fied the needs of all stakeholders. 

Having a common data repository for participating 
agencies is advantageous on many fronts. The value of 
surveillance data is increased because standardized 
validation and entry protocols ensure accurate data 
integration and storage, and hence maintain data accuracy for 
the duration of RageDB's existence. Easy access to view 
and analyse data from multiple agencies increases 
surveillance-related communication among agencies and 
enables decision-makers to react rapidly to new develop- 
ments. Increased communication among the agencies 
also facilitates multi-agency and inter-disciplinary colla- 
boration for the protection of public health and the pre- 
vention of future rabies outbreaks. Furthermore, easy 
access to data collected from all participating agencies 
has enabled analyses that could not be accomplished by 
solely relying on agency-specific data. These include 
studies that estimated the risk of detecting raccoon rabies 
on the landscape [12], evaluated the efficacy of 
surveillance methods to detect rabies [12], and determined 
landscape factors associated with raccoon abundance 
(Houle et al., unpublished observations) and seropositivity 
following oral vaccination (Mainguy et al., 
unpublished observations). 

The future of rabies surveillance and indeed, disease 
monitoring in general, will become increasingly 
effective for informing disease control programs as 
web-based data repository technologies evolve. In time, we 
expect more agencies will realise the benefits of storing 
their data securely and accurately in a shared 
multi-agency repository, and of sharing data to achieve the 
common goal of protecting public health. 

Endnotes 

a Since 2011 the Centre québécois sur la santé des 
animaux sauvages has diagnosed the rabies status of animal 
samples using a direct rapid immunohistochemical test [14], 
and the diagnosis is later confirmed by the Canadian 
Food Inspection Agency. 